TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT’S t OR BOOTSTRAP CALIBRATION BE APPLIED?

Authors

  • Jianqing Fan
  • Peter Hall
  • Qiwei Yao
Abstract

In the analysis of microarray data, and in some other contemporary statistical problems, it is not uncommon to apply hypothesis tests in a highly simultaneous way. The number, ν say, of tests used can be much larger than the sample sizes, n, to which the tests are applied, yet we wish to calibrate the tests so that the overall level of the simultaneous test is accurate. Often the sampling distribution is quite different for each test, so there may not be an opportunity for combining data across samples. In this setting, how large can ν be, as a function of n, before level accuracy becomes poor? In the present paper we answer this question in cases where the statistic under test is of Student’s t type. We show that if either the normal or Student’s t distribution is used for calibration then the level of the simultaneous test is accurate provided log ν increases at a strictly slower rate than n^{1/3} as n diverges. If log ν and n^{1/3} diverge at the same rate then asymptotic level accuracy requires the average value of standardised skewness, taken over all distributions to which the tests are applied, to converge to zero as n increases. On the other hand, if bootstrap methods are used for calibration then significantly larger values of ν are feasible; we may choose log ν almost as large as n^{1/2} and still achieve asymptotic level accuracy, regardless of the values of standardised skewness. It seems likely that similar conclusions hold for statistics more general than the Studentised mean, and that the upper bound of n^{1/2}, in the case of bootstrap calibration, can be increased.
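The question posed in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' procedure: it estimates the probability of at least one false rejection among ν one-sided Studentised-mean tests when each test uses a critical value at level α/ν, taken either from the normal distribution or from a generic percentile-t bootstrap. The function names, the centred exponential sampling distribution, the Bonferroni-style split of α and the one-sided form of the test are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def t_statistic(sample, mu0=0.0):
    """Studentised mean: sqrt(n) * (sample mean - mu0) / sample sd."""
    n = len(sample)
    return math.sqrt(n) * (mean(sample) - mu0) / stdev(sample)


def normal_cutoff(alpha, nu):
    """Upper normal critical value giving each of nu tests level alpha / nu."""
    return NormalDist().inv_cdf(1.0 - alpha / nu)


def bootstrap_cutoff(sample, alpha, nu, n_boot, rng):
    """Percentile-t bootstrap estimate of the same upper critical value.

    Assumption: a generic percentile-t scheme that resamples the data and
    Studentises each resample around the original sample mean.
    """
    n = len(sample)
    centre = mean(sample)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)
        sd = stdev(resample)
        if sd > 0:  # skip degenerate resamples with zero spread
            stats.append(math.sqrt(n) * (mean(resample) - centre) / sd)
    stats.sort()
    idx = min(len(stats) - 1, math.ceil((1.0 - alpha / nu) * len(stats)) - 1)
    return stats[idx]


def familywise_error(nu, n, alpha, n_rep, rng, n_boot=None):
    """Monte Carlo estimate of P(at least one false rejection) under the null.

    Data are centred exponential draws (mean 0, skewed), an illustrative choice.
    With n_boot=None the normal cutoff is used, otherwise the bootstrap cutoff.
    """
    z = normal_cutoff(alpha, nu)
    hits = 0
    for _ in range(n_rep):
        rejected = False
        for _ in range(nu):
            sample = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
            if n_boot is None:
                cut = z
            else:
                cut = bootstrap_cutoff(sample, alpha, nu, n_boot, rng)
            if t_statistic(sample) > cut:
                rejected = True
                break
        hits += rejected
    return hits / n_rep


if __name__ == "__main__":
    rng = random.Random(0)
    print("normal:   ", familywise_error(nu=20, n=30, alpha=0.05, n_rep=200, rng=rng))
    print("bootstrap:", familywise_error(nu=20, n=30, alpha=0.05, n_rep=50, rng=rng, n_boot=2000))
```

Comparing the two estimates with the nominal α for growing ν at fixed n gives a rough feel for when calibration breaks down. The bootstrap version needs n_boot to be well above ν/α, otherwise the extreme quantile cannot be resolved from the resamples.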

Similar resources

Or Bootstrap Calibration Be Applied?

In the analysis of microarray data, and in some other contemporary statistical problems, it is not uncommon to apply hypothesis tests in a highly simultaneous way. The number, N say, of tests used can be much larger than the sample sizes, n, to which the tests are applied, yet we wish to calibrate the tests so that the overall level of the simultaneous test is accurate. Often the sampling distr...

RELATIVE ERRORS IN CENTRAL LIMIT THEOREMS FOR STUDENT’S t STATISTIC, WITH APPLICATIONS

Student’s t statistic is frequently used in practice to test hypotheses about means. Today, in fields such as genomics, tens of thousands of t-tests are implemented simultaneously, one for each component of a long data vector. The distributions from which the t statistics are computed are almost invariably nonnormal and skew, and the sample sizes are relatively small, typically about one thousa...

Robustness and accuracy of methods for high dimensional data analysis based on Student’s t statistic

Student’s t statistic is finding applications today that were never envisaged when it was introduced more than a century ago. Many of these applications rely on properties, for example robustness against heavy tailed sampling distributions, that were not explicitly considered until relatively recently. In this paper we explore these features of the t statistic in the context of its application ...

Simultaneous spectrophotometric determination of ampicillin and penicillin in human plasma using multivariate calibration

An analytical methodology based on spectrophotometry and the partial least squares (PLS) algorithm for the simultaneous determination of ampicillin and penicillin in human plasma was developed and validated. The multivariate model was developed as a binary calibration model and it was built and validated with an independent set of synthetic and real samples in the presence of matrix. It is shown how a de...

Bootstrap Tilting Diagnostics

The fundamental bootstrap assumption is that the bootstrap approximates reality; that the sampling distribution of a statistic under the empirical distribution F̂ approximates the sampling distribution under the true (unknown) distribution. A natural way to test this is to investigate how the bootstrap distribution varies when F̂ is replaced by other distributions. Iterated bootstrapping, jackkni...

Publication date: 2006